{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "CcROQcVVoUiH"
   },
   "source": [
    "<a href=\"https://colab.research.google.com/github/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_4_bayesian_hyperparameter_opt.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Di3an3U1oUiJ"
   },
   "source": [
    "# T81-558: Applications of Deep Neural Networks\n",
    "**Module 8: Kaggle Data Sets**\n",
    "* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)\n",
    "* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "V1W44MImoUiJ"
   },
   "source": [
    "# Module 8 Material\n",
    "\n",
    "* Part 8.1: Introduction to Kaggle [[Video]](https://www.youtube.com/watch?v=v4lJBhdCuCU&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_1_kaggle_intro.ipynb)\n",
    "* Part 8.2: Building Ensembles with Scikit-Learn and Keras [[Video]](https://www.youtube.com/watch?v=LQ-9ZRBLasw&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_2_keras_ensembles.ipynb)\n",
    "* Part 8.3: How Should you Architect Your Keras Neural Network: Hyperparameters [[Video]](https://www.youtube.com/watch?v=1q9klwSoUQw&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_3_keras_hyperparameters.ipynb)\n",
    "* **Part 8.4: Bayesian Hyperparameter Optimization for Keras** [[Video]](https://www.youtube.com/watch?v=sXdxyUCCm8s&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_4_bayesian_hyperparameter_opt.ipynb)\n",
    "* Part 8.5: Current Semester's Kaggle [[Video]](https://www.youtube.com/watch?v=PHQt0aUasRg&list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN) [[Notebook]](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_08_5_kaggle_project.ipynb)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ISBSbu8XoUiK"
   },
   "source": [
    "# Google CoLab Instructions\n",
    "\n",
    "The following code ensures that Google CoLab is running the correct version of TensorFlow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "SzfTEj7foUiK",
    "outputId": "daf55884-f88a-4d7b-f96e-d1285ad2de79"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Note: using Google CoLab\n"
     ]
    }
   ],
   "source": [
    "# Startup Google CoLab\n",
    "try:\n",
    "    %tensorflow_version 2.x\n",
    "    COLAB = True\n",
    "    print(\"Note: using Google CoLab\")\n",
    "except:\n",
    "    print(\"Note: not using Google CoLab\")\n",
    "    COLAB = False\n",
    "\n",
    "# Nicely formatted time string\n",
    "def hms_string(sec_elapsed):\n",
    "    h = int(sec_elapsed / (60 * 60))\n",
    "    m = int((sec_elapsed % (60 * 60)) / 60)\n",
    "    s = sec_elapsed % 60\n",
    "    return \"{}:{:>02}:{:>05.2f}\".format(h, m, s)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "WydBc__qoUiL"
   },
   "source": [
    "# Part 8.4: Bayesian Hyperparameter Optimization for Keras\n",
    "\n",
    "Bayesian Hyperparameter Optimization is a method of finding hyperparameters more efficiently than a grid search. Because each candidate set of hyperparameters requires a retraining of the neural network, it is best to keep the number of candidate sets to a minimum. Bayesian Hyperparameter Optimization achieves this by training a model to predict good candidate sets of hyperparameters. [[Cite:snoek2012practical]](https://arxiv.org/pdf/1206.2944.pdf)\n",
    "\n",
    "* [bayesian-optimization](https://github.com/fmfn/BayesianOptimization)\n",
    "* [hyperopt](https://github.com/hyperopt/hyperopt)\n",
    "* [spearmint](https://github.com/JasperSnoek/spearmint)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "id": "NNKv5zWgoUiL"
   },
   "outputs": [],
   "source": [
    "# Ignore useless W0819 warnings generated by TensorFlow 2.0.  \n",
    "# Hopefully can remove this ignore in the future.\n",
    "# See https://github.com/tensorflow/tensorflow/issues/31308\n",
    "import logging, os\n",
    "logging.disable(logging.WARNING)\n",
    "os.environ[\"TF_CPP_MIN_LOG_LEVEL\"] = \"3\"\n",
    "\n",
    "import pandas as pd\n",
    "from scipy.stats import zscore\n",
    "\n",
    "# Read the data set\n",
    "df = pd.read_csv(\n",
    "    \"https://data.heatonresearch.com/data/t81-558/jh-simple-dataset.csv\",\n",
    "    na_values=['NA','?'])\n",
    "\n",
    "# Generate dummies for job\n",
    "df = pd.concat([df,pd.get_dummies(df['job'],prefix=\"job\")],axis=1)\n",
    "df.drop('job', axis=1, inplace=True)\n",
    "\n",
    "# Generate dummies for area\n",
    "df = pd.concat([df,pd.get_dummies(df['area'],prefix=\"area\")],axis=1)\n",
    "df.drop('area', axis=1, inplace=True)\n",
    "\n",
    "# Missing values for income\n",
    "med = df['income'].median()\n",
    "df['income'] = df['income'].fillna(med)\n",
    "\n",
    "# Standardize ranges\n",
    "df['income'] = zscore(df['income'])\n",
    "df['aspect'] = zscore(df['aspect'])\n",
    "df['save_rate'] = zscore(df['save_rate'])\n",
    "df['age'] = zscore(df['age'])\n",
    "df['subscriptions'] = zscore(df['subscriptions'])\n",
    "\n",
    "# Convert to numpy - Classification\n",
    "x_columns = df.columns.drop('product').drop('id')\n",
    "x = df[x_columns].values\n",
    "dummies = pd.get_dummies(df['product']) # Classification\n",
    "products = dummies.columns\n",
    "y = dummies.values"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "9qetzyCcoUiM"
   },
   "source": [
    "Now that we've preprocessed the data, we can begin the hyperparameter optimization.  We start by creating a function that generates the model based on just three parameters.  Bayesian optimization works on a vector of numbers, not on a problematic notion like how many layers and neurons are on each layer.  To represent this complex neuron structure as a vector, we use several numbers to describe this structure.   \n",
    "\n",
    "* **dropout** - The dropout percent for each layer.\n",
    "* **neuronPct** - What percent of our fixed 5,000 maximum number of neurons do we wish to use?  This parameter specifies the total count of neurons in the entire network.\n",
    "* **neuronShrink** - Neural networks usually start with more neurons on the first hidden layer and then decrease this count for additional layers.  This percent specifies how much to shrink subsequent layers based on the previous layer.  We stop adding more layers once we run out of neurons (the count specified by neuronPct).\n",
    "\n",
    "These three numbers define the structure of the neural network.  The commends in the below code show exactly how the program constructs the network."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "id": "d6j5HvCqoUiM"
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import os\n",
    "import numpy as np\n",
    "import time\n",
    "import tensorflow.keras.initializers\n",
    "import statistics\n",
    "import tensorflow.keras\n",
    "from sklearn import metrics\n",
    "from sklearn.model_selection import StratifiedKFold\n",
    "from tensorflow.keras.models import Sequential\n",
    "from tensorflow.keras.layers import Dense, Activation, Dropout, InputLayer\n",
    "from tensorflow.keras import regularizers\n",
    "from tensorflow.keras.callbacks import EarlyStopping\n",
    "from sklearn.model_selection import StratifiedShuffleSplit\n",
    "from sklearn.model_selection import ShuffleSplit\n",
    "from tensorflow.keras.layers import LeakyReLU,PReLU\n",
    "from tensorflow.keras.optimizers import Adam\n",
    "\n",
    "def generate_model(dropout, neuronPct, neuronShrink):\n",
    "    # We start with some percent of 5000 starting neurons on \n",
    "    # the first hidden layer.\n",
    "    neuronCount = int(neuronPct * 5000)\n",
    "    \n",
    "    # Construct neural network\n",
    "    model = Sequential()\n",
    "\n",
    "    # So long as there would have been at least 25 neurons and \n",
    "    # fewer than 10\n",
    "    # layers, create a new layer.\n",
    "    layer = 0\n",
    "    while neuronCount>25 and layer<10:\n",
    "        # The first (0th) layer needs an input input_dim(neuronCount)\n",
    "        if layer==0:\n",
    "            model.add(Dense(neuronCount, \n",
    "                input_dim=x.shape[1], \n",
    "                activation=PReLU()))\n",
    "        else:\n",
    "            model.add(Dense(neuronCount, activation=PReLU())) \n",
    "        layer += 1\n",
    "\n",
    "        # Add dropout after each hidden layer\n",
    "        model.add(Dropout(dropout))\n",
    "\n",
    "        # Shrink neuron count for each layer\n",
    "        neuronCount = neuronCount * neuronShrink\n",
    "\n",
    "    model.add(Dense(y.shape[1],activation='softmax')) # Output\n",
    "    return model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "LGfXROENoUiN"
   },
   "source": [
    "We can test this code to see how it creates a neural network based on three such parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "nCQC86b_oUiN",
    "outputId": "992541aa-360a-4b19-80d9-0cf01a515256"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Model: \"sequential\"\n",
      "_________________________________________________________________\n",
      " Layer (type)                Output Shape              Param #   \n",
      "=================================================================\n",
      " dense (Dense)               (None, 500)               24500     \n",
      "                                                                 \n",
      " dropout (Dropout)           (None, 500)               0         \n",
      "                                                                 \n",
      " dense_1 (Dense)             (None, 125)               62750     \n",
      "                                                                 \n",
      " dropout_1 (Dropout)         (None, 125)               0         \n",
      "                                                                 \n",
      " dense_2 (Dense)             (None, 31)                3937      \n",
      "                                                                 \n",
      " dropout_2 (Dropout)         (None, 31)                0         \n",
      "                                                                 \n",
      " dense_3 (Dense)             (None, 7)                 224       \n",
      "                                                                 \n",
      "=================================================================\n",
      "Total params: 91,411\n",
      "Trainable params: 91,411\n",
      "Non-trainable params: 0\n",
      "_________________________________________________________________\n"
     ]
    }
   ],
   "source": [
    "# Generate a model and see what the resulting structure looks like.\n",
    "model = generate_model(dropout=0.2, neuronPct=0.1, neuronShrink=0.25)\n",
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "vgY6GprqoUiN"
   },
   "source": [
    "We will now create a function to evaluate the neural network using three such parameters.  We use bootstrapping because one training run might have \"bad luck\" with the assigned random weights.  We use this function to train and then evaluate the neural network.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "id": "kbX4k-ynoUiN"
   },
   "outputs": [],
   "source": [
    "SPLITS = 2\n",
    "EPOCHS = 500\n",
    "PATIENCE = 10\n",
    "\n",
    "def evaluate_network(dropout,learning_rate,neuronPct,neuronShrink):\n",
    "    # Bootstrap\n",
    "\n",
    "    # for Classification\n",
    "    boot = StratifiedShuffleSplit(n_splits=SPLITS, test_size=0.1)\n",
    "    # for Regression\n",
    "    # boot = ShuffleSplit(n_splits=SPLITS, test_size=0.1)\n",
    "\n",
    "    # Track progress\n",
    "    mean_benchmark = []\n",
    "    epochs_needed = []\n",
    "    num = 0\n",
    "    \n",
    "    # Loop through samples\n",
    "    for train, test in boot.split(x,df['product']):\n",
    "        start_time = time.time()\n",
    "        num+=1\n",
    "\n",
    "        # Split train and test\n",
    "        x_train = x[train]\n",
    "        y_train = y[train]\n",
    "        x_test = x[test]\n",
    "        y_test = y[test]\n",
    "\n",
    "        model = generate_model(dropout, neuronPct, neuronShrink)\n",
    "        model.compile(loss='categorical_crossentropy', \n",
    "                      optimizer=Adam(learning_rate=learning_rate))\n",
    "        monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, \n",
    "        patience=PATIENCE, verbose=0, mode='auto', \n",
    "                                restore_best_weights=True)\n",
    "\n",
    "        # Train on the bootstrap sample\n",
    "        model.fit(x_train,y_train,validation_data=(x_test,y_test),\n",
    "                  callbacks=[monitor],verbose=0,epochs=EPOCHS)\n",
    "        epochs = monitor.stopped_epoch\n",
    "        epochs_needed.append(epochs)\n",
    "\n",
    "        # Predict on the out of boot (validation)\n",
    "        pred = model.predict(x_test)\n",
    "\n",
    "        # Measure this bootstrap's log loss\n",
    "        y_compare = np.argmax(y_test,axis=1) # For log loss calculation\n",
    "        score = metrics.log_loss(y_compare, pred)\n",
    "        mean_benchmark.append(score)\n",
    "        m1 = statistics.mean(mean_benchmark)\n",
    "        m2 = statistics.mean(epochs_needed)\n",
    "        mdev = statistics.pstdev(mean_benchmark)\n",
    "\n",
    "        # Record this iteration\n",
    "        time_took = time.time() - start_time\n",
    "        \n",
    "    tensorflow.keras.backend.clear_session()\n",
    "    return (-m1)\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "eYMNfmXPoUiN"
   },
   "source": [
    "You can try any combination of our three hyperparameters, plus the learning rate, to see how effective these four numbers are.  Of course, our goal is not to manually choose different combinations of these four hyperparameters; we seek to automate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "ZpyYvzKnoUiN",
    "outputId": "7b03f355-2a76-498d-a37a-fa06631f0a7f"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-0.6668764846259546\n"
     ]
    }
   ],
   "source": [
    "print(evaluate_network(\n",
    "    dropout=0.2,\n",
    "    learning_rate=1e-3,\n",
    "    neuronPct=0.2,\n",
    "    neuronShrink=0.2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "b05ssrH0qfdz"
   },
   "source": [
    "First, we must install the Bayesian optimization package if we are in Colab."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "S2iFisjIqjvO",
    "outputId": "a41238ba-deb5-47fe-fc98-6a505fac9021"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Requirement already satisfied: bayesian-optimization in /usr/local/lib/python3.7/dist-packages (1.2.0)\n",
      "Requirement already satisfied: scipy>=0.14.0 in /usr/local/lib/python3.7/dist-packages (from bayesian-optimization) (1.4.1)\n",
      "Requirement already satisfied: scikit-learn>=0.18.0 in /usr/local/lib/python3.7/dist-packages (from bayesian-optimization) (1.0.2)\n",
      "Requirement already satisfied: numpy>=1.9.0 in /usr/local/lib/python3.7/dist-packages (from bayesian-optimization) (1.21.5)\n",
      "Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from scikit-learn>=0.18.0->bayesian-optimization) (3.1.0)\n",
      "Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit-learn>=0.18.0->bayesian-optimization) (1.1.0)\n"
     ]
    }
   ],
   "source": [
    "# HIDE OUTPUT\n",
    "!pip install bayesian-optimization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Vh3nx_SDoUiO"
   },
   "source": [
    "We will now automate this process. We define the bounds for each of these four hyperparameters and begin the Bayesian optimization. Once the program finishes, the best combination of hyperparameters found is displayed. The **optimize** function accepts two parameters that will significantly impact how long the process takes to complete: \n",
    "\n",
    "* **n_iter** - How many steps of Bayesian optimization that you want to perform. The more steps, the more likely you will find a reasonable maximum.\n",
    "* **init_points**: How many steps of random exploration that you want to perform. Random exploration can help by diversifying the exploration space."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "FiymzM63oUiO",
    "outputId": "278deaac-2c57-4929-b739-5685ddab06dd"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "|   iter    |  target   |  dropout  | learni... | neuronPct | neuron... |\n",
      "-------------------------------------------------------------------------\n",
      "| \u001b[0m 1       \u001b[0m | \u001b[0m-0.8092  \u001b[0m | \u001b[0m 0.2081  \u001b[0m | \u001b[0m 0.07203 \u001b[0m | \u001b[0m 0.01011 \u001b[0m | \u001b[0m 0.3093  \u001b[0m |\n",
      "| \u001b[95m 2       \u001b[0m | \u001b[95m-0.7167  \u001b[0m | \u001b[95m 0.07323 \u001b[0m | \u001b[95m 0.009234\u001b[0m | \u001b[95m 0.1944  \u001b[0m | \u001b[95m 0.3521  \u001b[0m |\n",
      "| \u001b[0m 3       \u001b[0m | \u001b[0m-17.87   \u001b[0m | \u001b[0m 0.198   \u001b[0m | \u001b[0m 0.05388 \u001b[0m | \u001b[0m 0.425   \u001b[0m | \u001b[0m 0.6884  \u001b[0m |\n",
      "| \u001b[0m 4       \u001b[0m | \u001b[0m-0.8022  \u001b[0m | \u001b[0m 0.102   \u001b[0m | \u001b[0m 0.08781 \u001b[0m | \u001b[0m 0.03711 \u001b[0m | \u001b[0m 0.6738  \u001b[0m |\n",
      "| \u001b[0m 5       \u001b[0m | \u001b[0m-0.9209  \u001b[0m | \u001b[0m 0.2082  \u001b[0m | \u001b[0m 0.05587 \u001b[0m | \u001b[0m 0.149   \u001b[0m | \u001b[0m 0.2061  \u001b[0m |\n",
      "| \u001b[0m 6       \u001b[0m | \u001b[0m-17.96   \u001b[0m | \u001b[0m 0.3996  \u001b[0m | \u001b[0m 0.09683 \u001b[0m | \u001b[0m 0.3203  \u001b[0m | \u001b[0m 0.6954  \u001b[0m |\n",
      "| \u001b[0m 7       \u001b[0m | \u001b[0m-4.223   \u001b[0m | \u001b[0m 0.4373  \u001b[0m | \u001b[0m 0.08946 \u001b[0m | \u001b[0m 0.09419 \u001b[0m | \u001b[0m 0.04866 \u001b[0m |\n",
      "| \u001b[95m 8       \u001b[0m | \u001b[95m-0.7025  \u001b[0m | \u001b[95m 0.08475 \u001b[0m | \u001b[95m 0.08781 \u001b[0m | \u001b[95m 0.1074  \u001b[0m | \u001b[95m 0.4269  \u001b[0m |\n",
      "| \u001b[0m 9       \u001b[0m | \u001b[0m-8.666   \u001b[0m | \u001b[0m 0.478   \u001b[0m | \u001b[0m 0.05332 \u001b[0m | \u001b[0m 0.695   \u001b[0m | \u001b[0m 0.3224  \u001b[0m |\n",
      "| \u001b[0m 10      \u001b[0m | \u001b[0m-9.785   \u001b[0m | \u001b[0m 0.3426  \u001b[0m | \u001b[0m 0.08346 \u001b[0m | \u001b[0m 0.02811 \u001b[0m | \u001b[0m 0.7526  \u001b[0m |\n",
      "| \u001b[0m 11      \u001b[0m | \u001b[0m-4.881   \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.01    \u001b[0m | \u001b[0m 0.01    \u001b[0m |\n",
      "| \u001b[0m 12      \u001b[0m | \u001b[0m-21.59   \u001b[0m | \u001b[0m 0.2208  \u001b[0m | \u001b[0m 0.04135 \u001b[0m | \u001b[0m 0.5523  \u001b[0m | \u001b[0m 0.7468  \u001b[0m |\n",
      "| \u001b[0m 13      \u001b[0m | \u001b[0m-1.819   \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 1.0     \u001b[0m | \u001b[0m 0.01    \u001b[0m |\n",
      "| \u001b[0m 14      \u001b[0m | \u001b[0m-33.33   \u001b[0m | \u001b[0m 0.01058 \u001b[0m | \u001b[0m 0.08079 \u001b[0m | \u001b[0m 0.9652  \u001b[0m | \u001b[0m 0.7051  \u001b[0m |\n",
      "| \u001b[0m 15      \u001b[0m | \u001b[0m-1.418   \u001b[0m | \u001b[0m 0.4963  \u001b[0m | \u001b[0m 0.02476 \u001b[0m | \u001b[0m 0.9744  \u001b[0m | \u001b[0m 0.01896 \u001b[0m |\n",
      "| \u001b[0m 16      \u001b[0m | \u001b[0m-1.876   \u001b[0m | \u001b[0m 0.1247  \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.5781  \u001b[0m | \u001b[0m 0.01    \u001b[0m |\n",
      "| \u001b[0m 17      \u001b[0m | \u001b[0m-1.898   \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.01    \u001b[0m | \u001b[0m 1.0     \u001b[0m |\n",
      "| \u001b[0m 18      \u001b[0m | \u001b[0m-17.7    \u001b[0m | \u001b[0m 0.1621  \u001b[0m | \u001b[0m 0.06358 \u001b[0m | \u001b[0m 0.4065  \u001b[0m | \u001b[0m 0.754   \u001b[0m |\n",
      "| \u001b[0m 19      \u001b[0m | \u001b[0m-2.674   \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.01    \u001b[0m | \u001b[0m 0.5464  \u001b[0m |\n",
      "| \u001b[0m 20      \u001b[0m | \u001b[0m-1.931   \u001b[0m | \u001b[0m 0.499   \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.5538  \u001b[0m | \u001b[0m 0.01    \u001b[0m |\n",
      "| \u001b[0m 21      \u001b[0m | \u001b[0m-3.402   \u001b[0m | \u001b[0m 0.004722\u001b[0m | \u001b[0m 0.05502 \u001b[0m | \u001b[0m 0.1704  \u001b[0m | \u001b[0m 0.483   \u001b[0m |\n",
      "| \u001b[0m 22      \u001b[0m | \u001b[0m-2.8     \u001b[0m | \u001b[0m 0.08639 \u001b[0m | \u001b[0m 0.0838  \u001b[0m | \u001b[0m 0.04668 \u001b[0m | \u001b[0m 0.6864  \u001b[0m |\n",
      "| \u001b[0m 23      \u001b[0m | \u001b[0m-20.98   \u001b[0m | \u001b[0m 0.1168  \u001b[0m | \u001b[0m 0.0447  \u001b[0m | \u001b[0m 0.5546  \u001b[0m | \u001b[0m 0.9497  \u001b[0m |\n",
      "| \u001b[0m 24      \u001b[0m | \u001b[0m-4.565   \u001b[0m | \u001b[0m 0.2554  \u001b[0m | \u001b[0m 0.1     \u001b[0m | \u001b[0m 0.8418  \u001b[0m | \u001b[0m 0.01    \u001b[0m |\n",
      "| \u001b[0m 25      \u001b[0m | \u001b[0m-4.724   \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.1     \u001b[0m | \u001b[0m 0.3368  \u001b[0m | \u001b[0m 0.01    \u001b[0m |\n",
      "| \u001b[95m 26      \u001b[0m | \u001b[95m-0.6956  \u001b[0m | \u001b[95m 0.2505  \u001b[0m | \u001b[95m 0.007623\u001b[0m | \u001b[95m 0.01265 \u001b[0m | \u001b[95m 0.523   \u001b[0m |\n",
      "| \u001b[0m 27      \u001b[0m | \u001b[0m-0.7139  \u001b[0m | \u001b[0m 0.2967  \u001b[0m | \u001b[0m 0.01162 \u001b[0m | \u001b[0m 0.3735  \u001b[0m | \u001b[0m 0.01708 \u001b[0m |\n",
      "| \u001b[0m 28      \u001b[0m | \u001b[0m-2.145   \u001b[0m | \u001b[0m 0.499   \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.3053  \u001b[0m | \u001b[0m 0.2207  \u001b[0m |\n",
      "| \u001b[0m 29      \u001b[0m | \u001b[0m-2.069   \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.0     \u001b[0m | \u001b[0m 0.4808  \u001b[0m | \u001b[0m 0.2473  \u001b[0m |\n",
      "| \u001b[0m 30      \u001b[0m | \u001b[0m-0.7155  \u001b[0m | \u001b[0m 0.4082  \u001b[0m | \u001b[0m 0.01635 \u001b[0m | \u001b[0m 0.02488 \u001b[0m | \u001b[0m 0.1694  \u001b[0m |\n",
      "=========================================================================\n",
      "Total runtime: 1:36:11.56\n",
      "{'target': -0.6955536706512794, 'params': {'dropout': 0.2504561773412203, 'learning_rate': 0.0076232346709142924, 'neuronPct': 0.012648791521811826, 'neuronShrink': 0.5229748831552032}}\n"
     ]
    }
   ],
   "source": [
    "from bayes_opt import BayesianOptimization\n",
    "import time\n",
    "\n",
    "# Supress NaN warnings\n",
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\",category =RuntimeWarning)\n",
    "\n",
    "# Bounded region of parameter space\n",
    "pbounds = {'dropout': (0.0, 0.499),\n",
    "           'learning_rate': (0.0, 0.1),\n",
    "           'neuronPct': (0.01, 1),\n",
    "           'neuronShrink': (0.01, 1)\n",
    "          }\n",
    "\n",
    "optimizer = BayesianOptimization(\n",
    "    f=evaluate_network,\n",
    "    pbounds=pbounds,\n",
    "    verbose=2,  # verbose = 1 prints only when a maximum \n",
    "    # is observed, verbose = 0 is silent\n",
    "    random_state=1,\n",
    ")\n",
    "\n",
    "start_time = time.time()\n",
    "optimizer.maximize(init_points=10, n_iter=20,)\n",
    "time_took = time.time() - start_time\n",
    "\n",
    "print(f\"Total runtime: {hms_string(time_took)}\")\n",
    "print(optimizer.max)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "tZ7xW5UDqaOV"
   },
   "source": [
    "As you can see, the algorithm performed 30 total iterations. This total iteration count includes ten random and 20 optimization iterations. "
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "colab": {
   "collapsed_sections": [],
   "name": "test.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3.9 (tensorflow)",
   "language": "python",
   "name": "tensorflow"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
